Goto

Collaborating Authors

 Dardanelles


RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

Yu, Zichun, Xiong, Chenyan

arXiv.org Artificial Intelligence

High-quality pretraining data is the fossil fuel of large language models (LLMs), yet its reserves are running low for frontier models. In this paper, we introduce RePro, a novel web recycling method that trains a relatively small LM with reinforcement learning to generate effective and faithful rephrasings of pretraining data. Specifically, we design one quality reward and three faithfulness rewards, optimizing the LM rephraser to convert organic data into high-quality rephrasings while maintaining its core semantics and structure. In our experiment, we train a 4B rephraser to recycle 72B tokens sampled from DCLM-RefinedWeb. Pretraining results on 400M and 1.4B models demonstrate that RePro delivers 4.7%-14.0% relative accuracy gains over organic-only baseline on 22 downstream tasks. RePro also outperforms ReWire, the state-of-the-art web recycling method that prompts a 70B rephraser, as well as the organic baseline with a 4x larger data pool. Experiments with different amounts of recycled data highlight that RePro improves organic data efficiency by 2-3x. Individual and distributional analyses validate that RePro preserves more critical information and faithfully reflects the characteristics of organic data compared to prompting-based methods. Together, these results show that RePro provides an efficient and controllable path to effectively harness the fossil fuel of LLM pretraining. We open-source our code, rephraser, and recycled data at https://github.com/cxcscmu/RePro.


MedFormer: a data-driven model for forecasting the Mediterranean Sea

Epicoco, Italo, Donno, Davide, Accarino, Gabriele, Norberti, Simone, Grandi, Alessandro, Giurato, Michele, McAdam, Ronan, Elia, Donatello, Clementi, Emanuela, Nassisi, Paola, Scoccimarro, Enrico, Coppini, Giovanni, Gualdi, Silvio, Aloisio, Giovanni, Masina, Simona, Boccaletti, Giulio, Navarra, Antonio

arXiv.org Artificial Intelligence

Accurate ocean forecasting is essential for supporting a wide range of marine applications. Recent advances in artificial intelligence have highlighted the potential of data-driven models to outperform traditional numerical approaches, particularly in atmospheric weather forecasting. However, extending these methods to ocean systems remains challenging due to their inherently slower dynamics and complex boundary conditions. In this work, we present MedFormer, a fully data-driven deep learning model specifically designed for medium-range ocean forecasting in the Mediterranean Sea. MedFormer is based on a U-Net architecture augmented with 3D attention mechanisms and operates at a high horizontal resolution of 1/24°. The model is trained on 20 years of daily ocean reanalysis data and fine-tuned with high-resolution operational analyses. It generates 9-day forecasts using an autoregressive strategy. The model leverages both historical ocean states and atmospheric forcings, making it well-suited for operational use. We benchmark MedFormer against the state-of-the-art Mediterranean Forecasting System (MedFS), developed at Euro-Mediterranean Center on Climate Change (CMCC), using both analysis data and independent observations. The forecast skills, evaluated with the Root Mean Squared Difference and the Anomaly Correlation Coefficient, indicate that MedFormer consistently outperforms MedFS across key 3D ocean variables. These findings underscore the potential of data-driven approaches like MedFormer to complement, or even surpass, traditional numerical ocean forecasting systems in both accuracy and computational efficiency.


Regional Ocean Forecasting with Hierarchical Graph Neural Networks

Holmberg, Daniel, Clementi, Emanuela, Roos, Teemu

arXiv.org Artificial Intelligence

Accurate ocean forecasting systems are vital for understanding marine dynamics, which play a crucial role in environmental management and climate adaptation strategies. Traditional numerical solvers, while effective, are computationally expensive and time-consuming. Recent advancements in machine learning have revolutionized weather forecasting, offering fast and energy-efficient alternatives. Building on these advancements, we introduce SeaCast, a neural network designed for high-resolution, medium-range ocean forecasting. SeaCast employs a graph-based framework to effectively handle the complex geometry of ocean grids and integrates external forcing data tailored to the regional ocean context. Our approach is validated through experiments at a high spatial resolution using the operational numerical model of the Mediterranean Sea provided by the Copernicus Marine Service, along with both numerical and data-driven atmospheric forcings.


Opening the Black-Box: A Systematic Review on Explainable AI in Remote Sensing

Höhl, Adrian, Obadic, Ivica, Torres, Miguel Ángel Fernández, Najjar, Hiba, Oliveira, Dario, Akata, Zeynep, Dengel, Andreas, Zhu, Xiao Xiang

arXiv.org Artificial Intelligence

In recent years, black-box machine learning approaches have become a dominant modeling paradigm for knowledge extraction in Remote Sensing. Despite the potential benefits of uncovering the inner workings of these models with explainable AI, a comprehensive overview summarizing the used explainable AI methods and their objectives, findings, and challenges in Remote Sensing applications is still missing. In this paper, we address this issue by performing a systematic review to identify the key trends of how explainable AI is used in Remote Sensing and shed light on novel explainable AI approaches and emerging directions that tackle specific Remote Sensing challenges. We also reveal the common patterns of explanation interpretation, discuss the extracted scientific insights in Remote Sensing, and reflect on the approaches used for explainable AI methods evaluation. Our review provides a complete summary of the state-of-the-art in the field. Further, we give a detailed outlook on the challenges and promising research directions, representing a basis for novel methodological development and a useful starting point for new researchers in the field of explainable AI in Remote Sensing.


War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars

Hua, Wenyue, Fan, Lizhou, Li, Lingyao, Mei, Kai, Ji, Jianchao, Ge, Yingqiang, Hemphill, Libby, Zhang, Yongfeng

arXiv.org Artificial Intelligence

Can we avoid wars at the crossroads of history? This question has been pursued by individuals, scholars, policymakers, and organizations throughout human history. In this research, we attempt to answer the question based on the recent advances of Artificial Intelligence (AI) and Large Language Models (LLMs). We propose \textbf{WarAgent}, an LLM-powered multi-agent AI system, to simulate the participating countries, their decisions, and the consequences, in historical international conflicts, including the World War I (WWI), the World War II (WWII), and the Warring States Period (WSP) in Ancient China. By evaluating the simulation effectiveness, we examine the advancements and limitations of cutting-edge AI systems' abilities in studying complex collective human behaviors such as international conflicts under diverse settings. In these simulations, the emergent interactions among agents also offer a novel perspective for examining the triggers and conditions that lead to war. Our findings offer data-driven and AI-augmented insights that can redefine how we approach conflict resolution and peacekeeping strategies. The implications stretch beyond historical analysis, offering a blueprint for using AI to understand human history and possibly prevent future international conflicts. Code and data are available at \url{https://github.com/agiresearch/WarAgent}.


Self-Prompting Large Language Models for Zero-Shot Open-Domain QA

Li, Junlong, Zhang, Zhuosheng, Zhao, Hai

arXiv.org Artificial Intelligence

Open-Domain Question Answering (ODQA) aims at answering factoid questions without explicitly providing specific background documents. In a zero-shot setting, this task is more challenging since no data is available to train customized models like Retriever-Readers. Recently, Large Language Models (LLMs) like GPT-3 have shown their power in zero-shot ODQA with direct prompting methods, but these methods are still far from releasing the full powerfulness of LLMs only in an implicitly invoking way. In this paper, we propose a Self-Prompting framework to explicitly utilize the massive knowledge stored in the parameters of LLMs and their strong instruction understanding abilities. Concretely, we prompt LLMs step by step to generate multiple pseudo QA pairs with background passages and explanations from scratch and then use those generated elements for in-context learning. Experimental results show our method surpasses previous SOTA methods significantly on three widely-used ODQA datasets, and even achieves comparable performance with some Retriever-Reader models fine-tuned on full training data.


Bridging History with AI A Comparative Evaluation of GPT 3.5, GPT4, and GoogleBARD in Predictive Accuracy and Fact Checking

Tasar, Davut Emre, Tasar, Ceren Ocal

arXiv.org Artificial Intelligence

The rapid proliferation of information in the digital era underscores the importance of accurate historical representation and interpretation. While artificial intelligence has shown promise in various fields, its potential for historical fact-checking and gap-filling remains largely untapped. This study evaluates the performance of three large language models LLMs GPT 3.5, GPT 4, and GoogleBARD in the context of predicting and verifying historical events based on given data. A novel metric, Distance to Reality (DTR), is introduced to assess the models' outputs against established historical facts. The results reveal a substantial potential for AI in historical studies, with GPT 4 demonstrating superior performance. This paper underscores the need for further research into AI's role in enriching our understanding of the past and bridging historical knowledge gaps.


What do we know about Ukraine's use of Turkish Bayraktar drones?

Al Jazeera

During Russia's war on Ukraine, video footage has circulated on the internet showing the Turkish combat drone Bayraktar TB2 successfully striking the Russian army. But as so often during heightened conflicts, it is hard to distinguish between factual events and misinformation – some videos of the drone attacks have already been exposed as the latter. Given the chaotic events on the ground, it is almost impossible to assess how often and how successfully Ukraine has utilised its Turkish drones so far, Mauro Gilli, senior researcher in military technology and international security at ETH Zurich, told Al Jazeera. "There have been some video footages allegedly showing the employment of the TB2. Of course, information at this point is fragmented, and it needs to be taken with caution. "We do know that Ukraine bought some TB2 over the past years and that Turkey and Ukraine signed an agreement for the production within Ukrainian borders of the TB2 – but, as far as I know, production had not started yet.


LOGLISP: an alternative to PROLOG

Robinson, J.A.

Classics

Our own early attempts (as devoted users of LISP) to use PROLOG convinced us that it would be worth the effort to create within LISP a faithful implementation of Kowalski's logic programming idea. We felt it would be very convenient to be able to set up a knowledge base of assertions inside a LISP workspace, and to compute the answers to queries simply by executing appropriate function calls.In Hayes, J. E., Michie, D., and Pao, Y.-H. (Eds.), Machine Intelligence 10. Ellis Horwood.